Overview

Dataset statistics

Number of variables: 31
Number of observations: 4807996
Missing cells: 13887314
Missing cells (%): 9.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.1 GiB
Average record size in memory: 248.0 B
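
The missing-cell percentage can be cross-checked from the counts in this table. The sketch below assumes the percentage is defined as missing cells divided by the total number of cells (variables × observations); that definition is an assumption, not something the report states.

```python
# Figures copied from the dataset statistics table.
n_variables = 31
n_observations = 4_807_996
missing_cells = 13_887_314

# Assumption: "Missing cells (%)" = missing cells / (variables * observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

Under that assumption the computed value rounds to the 9.3% shown in the table.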

Variable types

Numeric: 15
Categorical: 16

Alerts

dat_cadastramento_fam has a high cardinality: 4960 distinct values High cardinality
dat_alteracao_fam has a high cardinality: 1364 distinct values High cardinality
dat_atualizacao_familia has a high cardinality: 1376 distinct values High cardinality
nom_estab_assist_saude_fam has a high cardinality: 25584 distinct values High cardinality
nom_centro_assist_fam has a high cardinality: 3699 distinct values High cardinality
cd_ibge is highly correlated with cod_centro_assist_fam High correlation
estrato is highly correlated with id_familia High correlation
classf is highly correlated with id_familia High correlation
id_familia is highly correlated with estrato and 1 other fields High correlation
vlr_renda_media_fam is highly correlated with marc_pbf High correlation
cod_local_domic_fam is highly correlated with cod_abaste_agua_domic_fam and 2 other fields High correlation
qtd_comodos_domic_fam is highly correlated with qtd_comodos_dormitorio_fam High correlation
qtd_comodos_dormitorio_fam is highly correlated with qtd_comodos_domic_fam High correlation
cod_agua_canalizada_fam is highly correlated with cod_abaste_agua_domic_fam High correlation
cod_abaste_agua_domic_fam is highly correlated with cod_local_domic_fam and 2 other fields High correlation
cod_destino_lixo_domic_fam is highly correlated with cod_local_domic_fam and 2 other fields High correlation
cod_calcamento_domic_fam is highly correlated with cod_local_domic_fam and 1 other fields High correlation
cod_centro_assist_fam is highly correlated with cd_ibge High correlation
marc_pbf is highly correlated with vlr_renda_media_fam High correlation
cd_ibge is highly correlated with cod_centro_assist_fam High correlation
estrato is highly correlated with id_familia High correlation
classf is highly correlated with id_familia High correlation
id_familia is highly correlated with estrato and 1 other fields High correlation
vlr_renda_media_fam is highly correlated with marc_pbf High correlation
cod_local_domic_fam is highly correlated with cod_destino_lixo_domic_fam and 1 other fields High correlation
qtd_comodos_domic_fam is highly correlated with qtd_comodos_dormitorio_fam High correlation
qtd_comodos_dormitorio_fam is highly correlated with qtd_comodos_domic_fam High correlation
cod_agua_canalizada_fam is highly correlated with cod_abaste_agua_domic_fam High correlation
cod_abaste_agua_domic_fam is highly correlated with cod_agua_canalizada_fam High correlation
cod_destino_lixo_domic_fam is highly correlated with cod_local_domic_fam and 1 other fields High correlation
cod_calcamento_domic_fam is highly correlated with cod_local_domic_fam and 1 other fields High correlation
cod_centro_assist_fam is highly correlated with cd_ibge High correlation
marc_pbf is highly correlated with vlr_renda_media_fam High correlation
cd_ibge is highly correlated with cod_centro_assist_fam High correlation
classf is highly correlated with id_familia High correlation
id_familia is highly correlated with classf High correlation
vlr_renda_media_fam is highly correlated with marc_pbf High correlation
cod_local_domic_fam is highly correlated with cod_abaste_agua_domic_fam and 2 other fields High correlation
qtd_comodos_domic_fam is highly correlated with qtd_comodos_dormitorio_fam High correlation
qtd_comodos_dormitorio_fam is highly correlated with qtd_comodos_domic_fam High correlation
cod_agua_canalizada_fam is highly correlated with cod_abaste_agua_domic_fam High correlation
cod_abaste_agua_domic_fam is highly correlated with cod_local_domic_fam and 1 other fields High correlation
cod_destino_lixo_domic_fam is highly correlated with cod_local_domic_fam High correlation
cod_calcamento_domic_fam is highly correlated with cod_local_domic_fam High correlation
cod_centro_assist_fam is highly correlated with cd_ibge High correlation
marc_pbf is highly correlated with vlr_renda_media_fam High correlation
cod_local_domic_fam is highly correlated with cod_abaste_agua_domic_fam and 1 other fields High correlation
cod_abaste_agua_domic_fam is highly correlated with cod_local_domic_fam and 2 other fields High correlation
cod_banheiro_domic_fam is highly correlated with cod_especie_domic_fam High correlation
ind_familia_quilombola_fam is highly correlated with cod_familia_indigena_fam High correlation
cod_especie_domic_fam is highly correlated with cod_abaste_agua_domic_fam and 3 other fields High correlation
cod_familia_indigena_fam is highly correlated with ind_familia_quilombola_fam High correlation
cod_agua_canalizada_fam is highly correlated with cod_abaste_agua_domic_fam and 1 other fields High correlation
cod_calcamento_domic_fam is highly correlated with cod_local_domic_fam and 1 other fields High correlation
cd_ibge is highly correlated with id_familia and 2 other fields High correlation
estrato is highly correlated with id_familia High correlation
classf is highly correlated with id_familia High correlation
id_familia is highly correlated with cd_ibge and 3 other fields High correlation
vlr_renda_media_fam is highly correlated with marc_pbf High correlation
cod_local_domic_fam is highly correlated with cod_agua_canalizada_fam and 4 other fields High correlation
qtd_comodos_domic_fam is highly correlated with qtd_comodos_dormitorio_fam High correlation
qtd_comodos_dormitorio_fam is highly correlated with qtd_comodos_domic_fam High correlation
cod_material_piso_fam is highly correlated with cod_material_domic_fam High correlation
cod_material_domic_fam is highly correlated with cd_ibge and 2 other fields High correlation
cod_agua_canalizada_fam is highly correlated with cod_local_domic_fam and 3 other fields High correlation
cod_abaste_agua_domic_fam is highly correlated with cod_local_domic_fam and 1 other fields High correlation
cod_banheiro_domic_fam is highly correlated with cod_agua_canalizada_fam and 1 other fields High correlation
cod_escoa_sanitario_domic_fam is highly correlated with cod_local_domic_fam and 1 other fields High correlation
cod_destino_lixo_domic_fam is highly correlated with cod_local_domic_fam and 3 other fields High correlation
cod_calcamento_domic_fam is highly correlated with cod_escoa_sanitario_domic_fam and 1 other fields High correlation
cod_centro_assist_fam is highly correlated with cd_ibge and 2 other fields High correlation
ind_parc_mds_fam is highly correlated with cod_local_domic_fam High correlation
marc_pbf is highly correlated with vlr_renda_media_fam High correlation
qtd_comodos_domic_fam has 224026 (4.7%) missing values Missing
qtd_comodos_dormitorio_fam has 222998 (4.6%) missing values Missing
cod_material_piso_fam has 222349 (4.6%) missing values Missing
cod_material_domic_fam has 222349 (4.6%) missing values Missing
cod_agua_canalizada_fam has 222349 (4.6%) missing values Missing
cod_abaste_agua_domic_fam has 222349 (4.6%) missing values Missing
cod_banheiro_domic_fam has 222349 (4.6%) missing values Missing
cod_escoa_sanitario_domic_fam has 494595 (10.3%) missing values Missing
cod_destino_lixo_domic_fam has 222349 (4.6%) missing values Missing
cod_iluminacao_domic_fam has 222349 (4.6%) missing values Missing
cod_calcamento_domic_fam has 222350 (4.6%) missing values Missing
nom_estab_assist_saude_fam has 2441370 (50.8%) missing values Missing
cod_eas_fam has 2441370 (50.8%) missing values Missing
nom_centro_assist_fam has 3031030 (63.0%) missing values Missing
cod_centro_assist_fam has 3031030 (63.0%) missing values Missing
ind_parc_mds_fam has 155334 (3.2%) missing values Missing
id_familia has unique values Unique
vlr_renda_media_fam has 494189 (10.3%) zeros Zeros
ind_parc_mds_fam has 4244511 (88.3%) zeros Zeros
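
The high-correlation alerts above list each relationship in both directions, and many pairs recur, presumably once per correlation measure the tool evaluated (an assumption; the report text does not label the groups). A minimal sketch of collapsing such alerts into unordered pairs, using a few pairs copied from the list; the helper name `unique_pairs` is hypothetical:

```python
# A few alert pairs copied from the list above: (variable, correlated variable).
alerts = [
    ("cd_ibge", "cod_centro_assist_fam"),
    ("cod_centro_assist_fam", "cd_ibge"),
    ("vlr_renda_media_fam", "marc_pbf"),
    ("marc_pbf", "vlr_renda_media_fam"),
    ("qtd_comodos_domic_fam", "qtd_comodos_dormitorio_fam"),
    ("qtd_comodos_dormitorio_fam", "qtd_comodos_domic_fam"),
]

def unique_pairs(pairs):
    """Collapse (a, b) and (b, a) into one unordered pair, keeping first-seen order."""
    seen = set()
    result = []
    for a, b in pairs:
        key = frozenset((a, b))
        if key not in seen:
            seen.add(key)
            result.append(tuple(sorted(key)))
    return result

pairs = unique_pairs(alerts)
for a, b in pairs:
    print(f"{a} <-> {b}")
```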

Reproduction

Analysis started: 2022-08-24 23:16:32.974227
Analysis finished: 2022-08-24 23:34:01.508735
Duration: 17 minutes and 28.53 seconds
Software version: pandas-profiling v3.2.0
Download configuration: config.json

Variables

cd_ibge
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 5534
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2993741.547
Minimum: 1100015
Maximum: 5300108
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 36.7 MiB

Quantile statistics

Minimum: 1100015
5-th percentile: 1501303
Q1: 2311875.25
median: 2927002
Q3: 3526209
95-th percentile: 5101803
Maximum: 5300108
Range: 4200093
Interquartile range (IQR): 1214333.75

Descriptive statistics

Standard deviation: 937280.9281
Coefficient of variation (CV): 0.3130801085
Kurtosis: 0.1068486716
Mean: 2993741.547
Median Absolute Deviation (MAD): 604801
Skewness: 0.393206125
Sum: 1.439389738 × 10^13
Variance: 8.784955382 × 10^11
Monotonicity: Not monotonic
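
Several of these figures are derived from others in the same block. The sketch below recomputes them from the reported values using the usual textbook definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared); that the report uses exactly these definitions is an assumption.

```python
import math

# Values copied from the cd_ibge statistics above.
minimum, maximum = 1100015, 5300108
q1, q3 = 2311875.25, 3526209
mean = 2993741.547
std = 937280.9281

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(value_range, iqr, round(cv, 10), f"{variance:.9e}")
```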
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
3550308 | 233230 | 4.9%
3304557 | 98360 | 2.0%
2304400 | 73396 | 1.5%
2927408 | 60027 | 1.2%
1302603 | 45467 | 0.9%
1501402 | 40950 | 0.9%
2611606 | 38405 | 0.8%
2111300 | 35663 | 0.7%
3106200 | 29279 | 0.6%
5300108 | 28538 | 0.6%
Other values (5524) | 4124681 | 85.8%

Smallest values

Value | Count | Frequency (%)
1100015 | 361 | < 0.1%
1100023 | 2753 | 0.1%
1100031 | 55 | < 0.1%
1100049 | 2259 | < 0.1%
1100056 | 271 | < 0.1%
1100064 | 181 | < 0.1%
1100072 | 111 | < 0.1%
1100080 | 287 | < 0.1%
1100098 | 435 | < 0.1%
1100106 | 1471 | < 0.1%

Largest values

Value | Count | Frequency (%)
5300108 | 28538 | 0.6%
5222302 | 153 | < 0.1%
5222203 | 50 | < 0.1%
5222054 | 169 | < 0.1%
5222005 | 175 | < 0.1%
5221908 | 108 | < 0.1%
5221858 | 3275 | 0.1%
5221809 | 49 | < 0.1%
5221700 | 295 | < 0.1%
5221601 | 1418 | < 0.1%

estrato
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 36.7 MiB

Value | Count
2 | 4085981
1 | 722015

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 4807996
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
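
As an illustration of those character properties, Python's standard `unicodedata` module exposes the Unicode general category of each code point: `Nd` is "Decimal Number", `Pd` is "Dash Punctuation", and `Po` is "Other Punctuation", which are the category names this report uses. The sample strings and the helper name `char_categories` below are hypothetical, chosen to resemble values in this dataset; this is a sketch, not the report's own implementation.

```python
import unicodedata
from collections import Counter

def char_categories(values):
    """Count Unicode general categories over all characters in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Hypothetical sample values in the style of the report's columns.
sample = ["2", "1", "2018-06-28", "1.0"]
counts = char_categories(sample)
print(counts)
```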

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 4085981 | 85.0%
1 | 722015 | 15.0%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
2 | 4085981 | 85.0%
1 | 722015 | 15.0%

Most occurring characters

Value | Count | Frequency (%)
2 | 4085981 | 85.0%
1 | 722015 | 15.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 4807996 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 4085981 | 85.0%
1 | 722015 | 15.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 4807996 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 4085981 | 85.0%
1 | 722015 | 15.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4807996 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 4085981 | 85.0%
1 | 722015 | 15.0%

classf
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 36.7 MiB

Value | Count
3 | 2856584
2 | 1023594
1 | 927818

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 4807996
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value | Count | Frequency (%)
3 | 2856584 | 59.4%
2 | 1023594 | 21.3%
1 | 927818 | 19.3%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
3 | 2856584 | 59.4%
2 | 1023594 | 21.3%
1 | 927818 | 19.3%

Most occurring characters

Value | Count | Frequency (%)
3 | 2856584 | 59.4%
2 | 1023594 | 21.3%
1 | 927818 | 19.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 4807996 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
3 | 2856584 | 59.4%
2 | 1023594 | 21.3%
1 | 927818 | 19.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 4807996 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
3 | 2856584 | 59.4%
2 | 1023594 | 21.3%
1 | 927818 | 19.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4807996 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
3 | 2856584 | 59.4%
2 | 1023594 | 21.3%
1 | 927818 | 19.3%

id_familia
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 4807996
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2709538.437
Minimum: 1
Maximum: 5290701
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 36.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 284955.75
Q1: 1393348.75
median: 2739219.5
Q3: 4049591.25
95-th percentile: 5044434.25
Maximum: 5290701
Range: 5290700
Interquartile range (IQR): 2656242.5

Descriptive statistics

Standard deviation: 1530538.528
Coefficient of variation (CV): 0.5648705724
Kurtosis: -1.204976871
Mean: 2709538.437
Median Absolute Deviation (MAD): 1327558
Skewness: -0.04921304009
Sum: 1.302744997 × 10^13
Variance: 2.342548186 × 10^12
Monotonicity: Strictly increasing
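
A strictly increasing sequence cannot contain repeated values, so this monotonicity result is consistent with the UNIQUE flag on id_familia. A minimal sketch of the check, run on the ten smallest id_familia values that the report lists; the helper name `is_strictly_increasing` is hypothetical:

```python
def is_strictly_increasing(values):
    """True if every value is greater than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))

# The ten smallest id_familia values listed in this report (2 and 5 are absent).
ids = [1, 3, 4, 6, 7, 8, 9, 10, 11, 12]

increasing = is_strictly_increasing(ids)
unique = len(set(ids)) == len(ids)
print(increasing, unique)
```

The gaps in the sample also fit the statistics above: the maximum (5290701) exceeds the number of observations (4807996), so the identifiers are not a contiguous range.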
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
3618383 | 1 | < 0.1%
3618400 | 1 | < 0.1%
3618399 | 1 | < 0.1%
3618398 | 1 | < 0.1%
3618397 | 1 | < 0.1%
3618396 | 1 | < 0.1%
3618395 | 1 | < 0.1%
3618394 | 1 | < 0.1%
3618392 | 1 | < 0.1%
Other values (4807986) | 4807986 | > 99.9%

Smallest values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%
11 | 1 | < 0.1%
12 | 1 | < 0.1%

Largest values

Value | Count | Frequency (%)
5290701 | 1 | < 0.1%
5290700 | 1 | < 0.1%
5290699 | 1 | < 0.1%
5290698 | 1 | < 0.1%
5290697 | 1 | < 0.1%
5290696 | 1 | < 0.1%
5290695 | 1 | < 0.1%
5290694 | 1 | < 0.1%
5290693 | 1 | < 0.1%
5290692 | 1 | < 0.1%

dat_cadastramento_fam
Categorical

HIGH CARDINALITY

Distinct: 4960
Distinct (%): 0.1%
Missing: 1
Missing (%): < 0.1%
Memory size: 36.7 MiB

Value | Count
2003-03-13 | 144547
2002-08-18 | 9396
2003-08-04 | 8886
2002-05-22 | 8314
2002-07-20 | 8105
Other values (4955) | 4628747

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 48079950
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 524
Unique (%): < 0.1%

Sample

1st row: 2018-06-28
2nd row: 2018-08-27
3rd row: 2018-02-23
4th row: 2013-12-27
5th row: 2018-03-26
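
The report types dat_cadastramento_fam as Categorical, yet every sample row has the form YYYY-MM-DD and every value is 10 characters long. If the column does hold ISO dates stored as strings (an assumption based only on the rows shown here), a parsing step along these lines would recover date values. The helper name `parse_iso` is hypothetical.

```python
from datetime import date

# Sample rows copied from the report.
samples = ["2018-06-28", "2018-08-27", "2018-02-23", "2013-12-27", "2018-03-26"]

def parse_iso(value):
    """Return a date for a YYYY-MM-DD string, or None if missing or malformed."""
    if value is None:
        return None
    try:
        return date.fromisoformat(value)
    except ValueError:
        return None

parsed = [parse_iso(s) for s in samples]
print(min(parsed), max(parsed))
```

Returning `None` rather than raising keeps the single missing entry reported for this variable from aborting a full pass over the column.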

Common Values

Value | Count | Frequency (%)
2003-03-13 | 144547 | 3.0%
2002-08-18 | 9396 | 0.2%
2003-08-04 | 8886 | 0.2%
2002-05-22 | 8314 | 0.2%
2002-07-20 | 8105 | 0.2%
2006-04-08 | 7738 | 0.2%
2006-04-01 | 7421 | 0.2%
2006-08-19 | 7285 | 0.2%
2002-09-07 | 7157 | 0.1%
2002-07-05 | 6831 | 0.1%
Other values (4950) | 4592315 | 95.5%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 12338158 | 25.7%
- | 9615990 | 20.0%
2 | 8090876 | 16.8%
1 | 7630928 | 15.9%
8 | 1850789 | 3.8%
3 | 1832746 | 3.8%
7 | 1682960 | 3.5%
6 | 1468942 | 3.1%
5 | 1280497 | 2.7%
4 | 1250563 | 2.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 38463960 | 80.0%
Dash Punctuation | 9615990 | 20.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 12338158 | 32.1%
2 | 8090876 | 21.0%
1 | 7630928 | 19.8%
8 | 1850789 | 4.8%
3 | 1832746 | 4.8%
7 | 1682960 | 4.4%
6 | 1468942 | 3.8%
5 | 1280497 | 3.3%
4 | 1250563 | 3.3%
9 | 1037501 | 2.7%

Dash Punctuation
Value | Count | Frequency (%)
- | 9615990 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 48079950 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 12338158 | 25.7%
- | 9615990 | 20.0%
2 | 8090876 | 16.8%
1 | 7630928 | 15.9%
8 | 1850789 | 3.8%
3 | 1832746 | 3.8%
7 | 1682960 | 3.5%
6 | 1468942 | 3.1%
5 | 1280497 | 2.7%
4 | 1250563 | 2.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 48079950 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 12338158 | 25.7%
- | 9615990 | 20.0%
2 | 8090876 | 16.8%
1 | 7630928 | 15.9%
8 | 1850789 | 3.8%
3 | 1832746 | 3.8%
7 | 1682960 | 3.5%
6 | 1468942 | 3.1%
5 | 1280497 | 2.7%
4 | 1250563 | 2.6%

dat_alteracao_fam
Categorical

HIGH CARDINALITY

Distinct: 1364
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 36.7 MiB

Value | Count
2018-10-01 | 1654010
2018-09-30 | 1433585
2018-10-02 | 218865
2018-09-25 | 21591
2018-09-27 | 19269
Other values (1359) | 1460676

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 48079960
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 62
Unique (%): < 0.1%

Sample

1st row: 2018-10-02
2nd row: 2018-11-29
3rd row: 2018-02-27
4th row: 2018-10-01
5th row: 2018-03-28

Common Values

Value | Count | Frequency (%)
2018-10-01 | 1654010 | 34.4%
2018-09-30 | 1433585 | 29.8%
2018-10-02 | 218865 | 4.6%
2018-09-25 | 21591 | 0.4%
2018-09-27 | 19269 | 0.4%
2018-11-13 | 16625 | 0.3%
2018-11-27 | 16263 | 0.3%
2018-11-28 | 16141 | 0.3%
2018-12-11 | 16012 | 0.3%
2018-12-04 | 15902 | 0.3%
Other values (1354) | 1379733 | 28.7%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 12944259 | 26.9%
1 | 10110933 | 21.0%
- | 9615992 | 20.0%
2 | 5981452 | 12.4%
8 | 4521601 | 9.4%
9 | 1755763 | 3.7%
3 | 1751997 | 3.6%
7 | 431774 | 0.9%
6 | 385296 | 0.8%
5 | 352701 | 0.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 38463968 | 80.0%
Dash Punctuation | 9615992 | 20.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 12944259 | 33.7%
1 | 10110933 | 26.3%
2 | 5981452 | 15.6%
8 | 4521601 | 11.8%
9 | 1755763 | 4.6%
3 | 1751997 | 4.6%
7 | 431774 | 1.1%
6 | 385296 | 1.0%
5 | 352701 | 0.9%
4 | 228192 | 0.6%

Dash Punctuation
Value | Count | Frequency (%)
- | 9615992 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 48079960 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 12944259 | 26.9%
1 | 10110933 | 21.0%
- | 9615992 | 20.0%
2 | 5981452 | 12.4%
8 | 4521601 | 9.4%
9 | 1755763 | 3.7%
3 | 1751997 | 3.6%
7 | 431774 | 0.9%
6 | 385296 | 0.8%
5 | 352701 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 48079960 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 12944259 | 26.9%
1 | 10110933 | 21.0%
- | 9615992 | 20.0%
2 | 5981452 | 12.4%
8 | 4521601 | 9.4%
9 | 1755763 | 3.7%
3 | 1751997 | 3.6%
7 | 431774 | 0.9%
6 | 385296 | 0.8%
5 | 352701 | 0.7%

vlr_renda_media_fam
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 2806
Distinct (%): 0.1%
Missing: 138
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 279.7731331
Minimum: 0
Maximum: 2862
Zeros: 494189
Zeros (%): 10.3%
Negative: 0
Negative (%): 0.0%
Memory size: 36.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 33
median: 100
Q3: 440
95-th percentile: 954
Maximum: 2862
Range: 2862
Interquartile range (IQR): 407

Descriptive statistics

Standard deviation: 350.8036979
Coefficient of variation (CV): 1.253886297
Kurtosis: 3.140338859
Mean: 279.7731331
Median Absolute Deviation (MAD): 100
Skewness: 1.685596758
Sum: 1345109496
Variance: 123063.2345
Monotonicity: Not monotonic
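
The reported mean and zero percentage can be cross-checked against the counts. The sketch below assumes the mean is taken over non-missing values only and that the zero percentage is relative to all observations; both are assumptions about how the figures were computed, checked here against the reported numbers.

```python
import math

# Values copied from the overview and the vlr_renda_media_fam statistics.
n_observations = 4_807_996
n_missing = 138
total = 1_345_109_496        # Sum
reported_mean = 279.7731331  # Mean
zeros = 494_189

# Assumption: mean = sum / number of non-missing values.
mean = total / (n_observations - n_missing)
# Assumption: zeros (%) is relative to all observations.
zeros_pct = 100 * zeros / n_observations

print(f"mean = {mean:.7f}")
print(f"zeros = {zeros_pct:.1f}%")
```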
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 494189 | 10.3%
954 | 212385 | 4.4%
50 | 207430 | 4.3%
937 | 200318 | 4.2%
100 | 126628 | 2.6%
75 | 112083 | 2.3%
477 | 103440 | 2.2%
66 | 91883 | 1.9%
25 | 82715 | 1.7%
33 | 78055 | 1.6%
Other values (2796) | 3098732 | 64.4%

Smallest values

Value | Count | Frequency (%)
0 | 494189 | 10.3%
1 | 12608 | 0.3%
2 | 21656 | 0.5%
3 | 15449 | 0.3%
4 | 23636 | 0.5%
5 | 26443 | 0.5%
6 | 25292 | 0.5%
7 | 9765 | 0.2%
8 | 38369 | 0.8%
9 | 5884 | 0.1%

Largest values

Value | Count | Frequency (%)
2862 | 61 | < 0.1%
2861 | 2 | < 0.1%
2860 | 5 | < 0.1%
2859 | 4 | < 0.1%
2858 | 1 | < 0.1%
2857 | 2 | < 0.1%
2855 | 1 | < 0.1%
2854 | 4 | < 0.1%
2853 | 1 | < 0.1%
2852 | 1 | < 0.1%

dat_atualizacao_familia
Categorical

HIGH CARDINALITY

Distinct: 1376
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 36.7 MiB

Value | Count
2018-09-13 | 17127
2018-09-11 | 16350
2018-09-12 | 16257
2018-11-13 | 15519
2018-09-04 | 15451
Other values (1371) | 4727292

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 48079960
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 32
Unique (%): < 0.1%

Sample

1st row: 2018-06-28
2nd row: 2018-11-29
3rd row: 2018-02-23
4th row: 2017-06-22
5th row: 2018-03-26

Common Values

Value | Count | Frequency (%)
2018-09-13 | 17127 | 0.4%
2018-09-11 | 16350 | 0.3%
2018-09-12 | 16257 | 0.3%
2018-11-13 | 15519 | 0.3%
2018-09-04 | 15451 | 0.3%
2018-11-28 | 14942 | 0.3%
2018-09-10 | 14898 | 0.3%
2018-08-07 | 14884 | 0.3%
2018-11-12 | 14855 | 0.3%
2018-11-21 | 14852 | 0.3%
Other values (1366) | 4652861 | 96.8%

Length

Histogram of lengths of the category

Most occurring characters

Value | Count | Frequency (%)
0 | 10567252 | 22.0%
- | 9615992 | 20.0%
1 | 9105399 | 18.9%
2 | 7526830 | 15.7%
8 | 3524905 | 7.3%
7 | 2357540 | 4.9%
6 | 1445419 | 3.0%
3 | 1097880 | 2.3%
5 | 1058624 | 2.2%
9 | 932588 | 1.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 38463968 | 80.0%
Dash Punctuation | 9615992 | 20.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 10567252 | 27.5%
1 | 9105399 | 23.7%
2 | 7526830 | 19.6%
8 | 3524905 | 9.2%
7 | 2357540 | 6.1%
6 | 1445419 | 3.8%
3 | 1097880 | 2.9%
5 | 1058624 | 2.8%
9 | 932588 | 2.4%
4 | 847531 | 2.2%

Dash Punctuation
Value | Count | Frequency (%)
- | 9615992 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 48079960 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 10567252 | 22.0%
- | 9615992 | 20.0%
1 | 9105399 | 18.9%
2 | 7526830 | 15.7%
8 | 3524905 | 7.3%
7 | 2357540 | 4.9%
6 | 1445419 | 3.0%
3 | 1097880 | 2.3%
5 | 1058624 | 2.2%
9 | 932588 | 1.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 48079960 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 10567252 | 22.0%
- | 9615992 | 20.0%
1 | 9105399 | 18.9%
2 | 7526830 | 15.7%
8 | 3524905 | 7.3%
7 | 2357540 | 4.9%
6 | 1445419 | 3.0%
3 | 1097880 | 2.3%
5 | 1058624 | 2.2%
9 | 932588 | 1.9%

cod_local_domic_fam
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 20727
Missing (%): 0.4%
Memory size: 36.7 MiB

Value | Count
1.0 | 3852815
2.0 | 934454

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 14361807
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0
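
The category labels here are rendered as "1.0" and "2.0", which suggests integer codes that were stored as floating-point values, possibly because the column contains missing entries (an assumption; the report does not say how the data was loaded). A minimal sketch of turning such labels back into integer codes while preserving missing entries; the helper name `to_code` and the missing entry in the sample list are hypothetical:

```python
def to_code(value):
    """Convert a float-like label such as "1.0" to an int code; keep missing as None."""
    if value is None or value == "":
        return None
    number = float(value)
    if not number.is_integer():
        raise ValueError(f"not an integer code: {value!r}")
    return int(number)

# Labels in the form shown in the sample, plus a hypothetical missing entry.
labels = ["1.0", "1.0", "2.0", None]
codes = [to_code(v) for v in labels]
print(codes)
```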

Common Values

Value | Count | Frequency (%)
1.0 | 3852815 | 80.1%
2.0 | 934454 | 19.4%
(Missing) | 20727 | 0.4%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
1.0 | 3852815 | 80.5%
2.0 | 934454 | 19.5%

Most occurring characters

Value | Count | Frequency (%)
. | 4787269 | 33.3%
0 | 4787269 | 33.3%
1 | 3852815 | 26.8%
2 | 934454 | 6.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 9574538 | 66.7%
Other Punctuation | 4787269 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 4787269 | 50.0%
1 | 3852815 | 40.2%
2 | 934454 | 9.8%

Other Punctuation
Value | Count | Frequency (%)
. | 4787269 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 14361807 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
. | 4787269 | 33.3%
0 | 4787269 | 33.3%
1 | 3852815 | 26.8%
2 | 934454 | 6.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 14361807 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 4787269 | 33.3%
0 | 4787269 | 33.3%
1 | 3852815 | 26.8%
2 | 934454 | 6.5%

cod_especie_domic_fam
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 20730
Missing (%): 0.4%
Memory size: 36.7 MiB

Value | Count
1.0 | 4585649
2.0 | 161167
3.0 | 40450

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 14361798
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value | Count | Frequency (%)
1.0 | 4585649 | 95.4%
2.0 | 161167 | 3.4%
3.0 | 40450 | 0.8%
(Missing) | 20730 | 0.4%

Length

Histogram of lengths of the category

Category Frequency Plot

Value | Count | Frequency (%)
1.0 | 4585649 | 95.8%
2.0 | 161167 | 3.4%
3.0 | 40450 | 0.8%

Most occurring characters

Value | Count | Frequency (%)
. | 4787266 | 33.3%
0 | 4787266 | 33.3%
1 | 4585649 | 31.9%
2 | 161167 | 1.1%
3 | 40450 | 0.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 9574532 | 66.7%
Other Punctuation | 4787266 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 4787266 | 50.0%
1 | 4585649 | 47.9%
2 | 161167 | 1.7%
3 | 40450 | 0.4%

Other Punctuation
Value | Count | Frequency (%)
. | 4787266 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 14361798 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
. | 4787266 | 33.3%
0 | 4787266 | 33.3%
1 | 4585649 | 31.9%
2 | 161167 | 1.1%
3 | 40450 | 0.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 14361798 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 4787266 | 33.3%
0 | 4787266 | 33.3%
1 | 4585649 | 31.9%
2 | 161167 | 1.1%
3 | 40450 | 0.3%

qtd_comodos_domic_fam
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 21
Distinct (%): < 0.1%
Missing: 224026
Missing (%): 4.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.418609851
Minimum: 0
Maximum: 20
Zeros: 184
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 36.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 4
median: 5
Q3: 5
95-th percentile: 7
Maximum: 20
Range: 20
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.380896982
Coefficient of variation (CV): 0.3125184229
Kurtosis: 1.64972056
Mean: 4.418609851
Median Absolute Deviation (MAD): 1
Skewness: 0.2039207811
Sum: 20254775
Variance: 1.906876475
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)

Common values

Value | Count | Frequency (%)
5 | 1618757 | 33.7%
4 | 1114384 | 23.2%
3 | 700405 | 14.6%
6 | 525483 | 10.9%
2 | 310040 | 6.4%
7 | 150272 | 3.1%
1 | 83307 | 1.7%
8 | 54874 | 1.1%
9 | 15828 | 0.3%
10 | 6554 | 0.1%
Other values (11) | 4066 | 0.1%
(Missing) | 224026 | 4.7%

Smallest values

Value | Count | Frequency (%)
0 | 184 | < 0.1%
1 | 83307 | 1.7%
2 | 310040 | 6.4%
3 | 700405 | 14.6%
4 | 1114384 | 23.2%
5 | 1618757 | 33.7%
6 | 525483 | 10.9%
7 | 150272 | 3.1%
8 | 54874 | 1.1%
9 | 15828 | 0.3%

Largest values

Value | Count | Frequency (%)
20 | 69 | < 0.1%
19 | 20 | < 0.1%
18 | 34 | < 0.1%
17 | 27 | < 0.1%
16 | 55 | < 0.1%
15 | 107 | < 0.1%
14 | 187 | < 0.1%
13 | 392 | < 0.1%
12 | 1135 | < 0.1%
11 | 1856 | < 0.1%

qtd_comodos_dormitorio_fam
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 20
Distinct (%): < 0.1%
Missing: 222998
Missing (%): 4.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.775068386
Minimum: 0
Maximum: 20
Zeros: 1999
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 36.7 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
median: 2
Q3: 2
95-th percentile: 3
Maximum: 20
Range: 20
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7538018879
Coefficient of variation (CV): 0.4246607589
Kurtosis: 17.08510656
Mean: 1.775068386
Median Absolute Deviation (MAD): 1
Skewness: 1.569846723
Sum: 8138685
Variance: 0.5682172861
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)

Common values

Value | Count | Frequency (%)
2 | 2175147 | 45.2%
1 | 1766770 | 36.7%
3 | 566551 | 11.8%
4 | 60978 | 1.3%
5 | 9646 | 0.2%
6 | 2382 | < 0.1%
0 | 1999 | < 0.1%
7 | 553 | < 0.1%
8 | 240 | < 0.1%
12 | 162 | < 0.1%
Other values (10) | 570 | < 0.1%
(Missing) | 222998 | 4.6%

Smallest values

Value | Count | Frequency (%)
0 | 1999 | < 0.1%
1 | 1766770 | 36.7%
2 | 2175147 | 45.2%
3 | 566551 | 11.8%
4 | 60978 | 1.3%
5 | 9646 | 0.2%
6 | 2382 | < 0.1%
7 | 553 | < 0.1%
8 | 240 | < 0.1%
9 | 70 | < 0.1%

Largest values

Value | Count | Frequency (%)
20 | 157 | < 0.1%
18 | 3 | < 0.1%
17 | 5 | < 0.1%
16 | 1 | < 0.1%
15 | 47 | < 0.1%
14 | 43 | < 0.1%
13 | 22 | < 0.1%
12 | 162 | < 0.1%
11 | 61 | < 0.1%
10 | 161 | < 0.1%

cod_material_piso_fam
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 7
Distinct (%): < 0.1%
Missing: 222349
Missing (%): 4.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.586297637
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 36.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 2
median: 5
Q3: 5
95-th percentile: 5
Maximum: 7
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.537008739
Coefficient of variation (CV): 0.4285781312
Kurtosis: -1.715803855
Mean: 3.586297637
Median Absolute Deviation (MAD): 0
Skewness: -0.1741459638
Sum: 16445495
Variance: 2.362395864
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common values

Value | Count | Frequency (%)
5 | 2302685 | 47.9%
2 | 1829595 | 38.1%
1 | 185782 | 3.9%
4 | 167288 | 3.5%
3 | 70236 | 1.5%
7 | 26872 | 0.6%
6 | 3189 | 0.1%
(Missing) | 222349 | 4.6%

Smallest values

Value | Count | Frequency (%)
1 | 185782 | 3.9%
2 | 1829595 | 38.1%
3 | 70236 | 1.5%
4 | 167288 | 3.5%
5 | 2302685 | 47.9%
6 | 3189 | 0.1%
7 | 26872 | 0.6%

Largest values

Value | Count | Frequency (%)
7 | 26872 | 0.6%
6 | 3189 | 0.1%
5 | 2302685 | 47.9%
4 | 167288 | 3.5%
3 | 70236 | 1.5%
2 | 1829595 | 38.1%
1 | 185782 | 3.9%

cod_material_domic_fam
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 8
Distinct (%): < 0.1%
Missing: 222349
Missing (%): 4.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.540680955
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 36.7 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 2
95-th percentile: 4
Maximum: 8
Range: 7
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation1.233464673
Coefficient of variation (CV)0.8005970794
Kurtosis11.39308843
Mean1.540680955
Median Absolute Deviation (MAD)0
Skewness3.219770058
Sum7065019
Variance1.5214351
MonotonicityNot monotonic
Histogram with fixed size bins (bins=8)
Most frequent values
Value       Count     Frequency (%)
1           3365529   70.0%
2           690555    14.4%
3           274208    5.7%
6           76891     1.6%
5           62766     1.3%
8           60505     1.3%
4           49937     1.0%
7           5256      0.1%
(Missing)   222349    4.6%

Smallest values
Value       Count     Frequency (%)
1           3365529   70.0%
2           690555    14.4%
3           274208    5.7%
4           49937     1.0%
5           62766     1.3%
6           76891     1.6%
7           5256      0.1%
8           60505     1.3%

Largest values
Value       Count     Frequency (%)
8           60505     1.3%
7           5256      0.1%
6           76891     1.6%
5           62766     1.3%
4           49937     1.0%
3           274208    5.7%
2           690555    14.4%
1           3365529   70.0%

cod_agua_canalizada_fam
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct2
Distinct (%)< 0.1%
Missing222349
Missing (%)4.6%
Memory size36.7 MiB
1.0
4023381 
2.0
562266 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters13756941
Distinct characters4
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
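
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count Unicode general categories in a handful of strings. The helper name `category_counts` and the sample list are assumptions made for this example; the report does not state how the profiling tool computes these figures.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            # e.g. "Nd" = Decimal Number, "Po" = Other Punctuation
            counts[unicodedata.category(ch)] += 1
    return counts


# Illustrative strings shaped like the values in this column.
sample = ["1.0", "1.0", "1.0", "1.0", "2.0"]
counts = category_counts(sample)
print(counts)
```

Each three-character value contributes two decimal digits and one punctuation mark, which is why the report shows a two-thirds to one-third split between Decimal Number and Other Punctuation.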

Unique

Unique0
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row1.0
5th row2.0

Common Values

ValueCountFrequency (%)
1.04023381
83.7%
2.0562266
 
11.7%
(Missing)222349
 
4.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1.04023381
87.7%
2.0562266
 
12.3%
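
The percentages in this table differ from those under Common Values even though the counts are identical. The figures are consistent with Common Values dividing by all rows, missing ones included, and the category frequency table dividing by non-missing rows only. The sketch below reproduces both sets of percentages under that assumption; the variable names are illustrative.

```python
n_rows = 4_807_996                            # total observations in the dataset
counts = {"1.0": 4_023_381, "2.0": 562_266}   # non-missing category counts

n_valid = sum(counts.values())                # rows with a non-missing value

# Share of all rows (missing rows stay in the denominator).
share_all = {k: round(100 * v / n_rows, 1) for k, v in counts.items()}
# Share of non-missing rows only.
share_valid = {k: round(100 * v / n_valid, 1) for k, v in counts.items()}

print(share_all)
print(share_valid)
```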

Most occurring characters

ValueCountFrequency (%)
.4585647
33.3%
04585647
33.3%
14023381
29.2%
2562266
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number9171294
66.7%
Other Punctuation4585647
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
04585647
50.0%
14023381
43.9%
2562266
 
6.1%
Other Punctuation
ValueCountFrequency (%)
.4585647
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common13756941
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.4585647
33.3%
04585647
33.3%
14023381
29.2%
2562266
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII13756941
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.4585647
33.3%
04585647
33.3%
14023381
29.2%
2562266
 
4.1%

cod_abaste_agua_domic_fam
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct4
Distinct (%)< 0.1%
Missing222349
Missing (%)4.6%
Memory size36.7 MiB
1.0
3537937 
2.0
693380 
4.0
 
211057
3.0
 
143273

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters13756941
Distinct characters6
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row1.0
5th row4.0

Common Values

ValueCountFrequency (%)
1.03537937
73.6%
2.0693380
 
14.4%
4.0211057
 
4.4%
3.0143273
 
3.0%
(Missing)222349
 
4.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1.03537937
77.2%
2.0693380
 
15.1%
4.0211057
 
4.6%
3.0143273
 
3.1%

Most occurring characters

ValueCountFrequency (%)
.4585647
33.3%
04585647
33.3%
13537937
25.7%
2693380
 
5.0%
4211057
 
1.5%
3143273
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number9171294
66.7%
Other Punctuation4585647
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
04585647
50.0%
13537937
38.6%
2693380
 
7.6%
4211057
 
2.3%
3143273
 
1.6%
Other Punctuation
ValueCountFrequency (%)
.4585647
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common13756941
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.4585647
33.3%
04585647
33.3%
13537937
25.7%
2693380
 
5.0%
4211057
 
1.5%
3143273
 
1.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII13756941
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.4585647
33.3%
04585647
33.3%
13537937
25.7%
2693380
 
5.0%
4211057
 
1.5%
3143273
 
1.0%

cod_banheiro_domic_fam
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct2
Distinct (%)< 0.1%
Missing222349
Missing (%)4.6%
Memory size36.7 MiB
1.0
4313401 
2.0
 
272246

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters13756941
Distinct characters4
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row1.0
4th row1.0
5th row1.0

Common Values

ValueCountFrequency (%)
1.04313401
89.7%
2.0272246
 
5.7%
(Missing)222349
 
4.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1.04313401
94.1%
2.0272246
 
5.9%

Most occurring characters

ValueCountFrequency (%)
.4585647
33.3%
04585647
33.3%
14313401
31.4%
2272246
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number9171294
66.7%
Other Punctuation4585647
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
04585647
50.0%
14313401
47.0%
2272246
 
3.0%
Other Punctuation
ValueCountFrequency (%)
.4585647
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common13756941
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.4585647
33.3%
04585647
33.3%
14313401
31.4%
2272246
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII13756941
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.4585647
33.3%
04585647
33.3%
14313401
31.4%
2272246
 
2.0%

cod_escoa_sanitario_domic_fam
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct6
Distinct (%)< 0.1%
Missing494595
Missing (%)10.3%
Infinite0
Infinite (%)0.0%
Mean1.864986585
Minimum1
Maximum6
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size36.7 MiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q33
95-th percentile3
Maximum6
Range5
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.04023785
Coefficient of variation (CV)0.5577722962
Kurtosis0.6721160173
Mean1.864986585
Median Absolute Deviation (MAD)0
Skewness0.9832491488
Sum8044435
Variance1.082094784
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
Most frequent values
Value       Count     Frequency (%)
1           2257758   47.0%
3           1244054   25.9%
2           648989    13.5%
4           88082     1.8%
5           42899     0.9%
6           31619     0.7%
(Missing)   494595    10.3%

Smallest values
Value       Count     Frequency (%)
1           2257758   47.0%
2           648989    13.5%
3           1244054   25.9%
4           88082     1.8%
5           42899     0.9%
6           31619     0.7%

Largest values
Value       Count     Frequency (%)
6           31619     0.7%
5           42899     0.9%
4           88082     1.8%
3           1244054   25.9%
2           648989    13.5%
1           2257758   47.0%

cod_destino_lixo_domic_fam
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct6
Distinct (%)< 0.1%
Missing222349
Missing (%)4.6%
Infinite0
Infinite (%)0.0%
Mean1.409208123
Minimum1
Maximum6
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size36.7 MiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q31
95-th percentile3
Maximum6
Range5
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.8344851491
Coefficient of variation (CV)0.5921660085
Kurtosis3.966479432
Mean1.409208123
Median Absolute Deviation (MAD)0
Skewness2.023718151
Sum6462131
Variance0.6963654641
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
Most frequent values
Value       Count     Frequency (%)
1           3575970   74.4%
3           665277    13.8%
2           262543    5.5%
4           62082     1.3%
6           18041     0.4%
5           1734      < 0.1%
(Missing)   222349    4.6%

Smallest values
Value       Count     Frequency (%)
1           3575970   74.4%
2           262543    5.5%
3           665277    13.8%
4           62082     1.3%
5           1734      < 0.1%
6           18041     0.4%

Largest values
Value       Count     Frequency (%)
6           18041     0.4%
5           1734      < 0.1%
4           62082     1.3%
3           665277    13.8%
2           262543    5.5%
1           3575970   74.4%

cod_iluminacao_domic_fam
Real number (ℝ≥0)

MISSING

Distinct6
Distinct (%)< 0.1%
Missing222349
Missing (%)4.6%
Infinite0
Infinite (%)0.0%
Mean1.317827343
Minimum1
Maximum6
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size36.7 MiB

Quantile statistics

Minimum1
5-th percentile1
Q11
median1
Q31
95-th percentile3
Maximum6
Range5
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0.9058275392
Coefficient of variation (CV)0.6873643534
Kurtosis13.16471092
Mean1.317827343
Median Absolute Deviation (MAD)0
Skewness3.515250495
Sum6043091
Variance0.8205235307
MonotonicityNot monotonic
Histogram with fixed size bins (bins=6)
Most frequent values
Value       Count     Frequency (%)
1           3888367   80.9%
2           279385    5.8%
3           271003    5.6%
6           86391     1.8%
4           37906     0.8%
5           22595     0.5%
(Missing)   222349    4.6%

Smallest values
Value       Count     Frequency (%)
1           3888367   80.9%
2           279385    5.8%
3           271003    5.6%
4           37906     0.8%
5           22595     0.5%
6           86391     1.8%

Largest values
Value       Count     Frequency (%)
6           86391     1.8%
5           22595     0.5%
4           37906     0.8%
3           271003    5.6%
2           279385    5.8%
1           3888367   80.9%

cod_calcamento_domic_fam
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct3
Distinct (%)< 0.1%
Missing222350
Missing (%)4.6%
Memory size36.7 MiB
1.0
2720352 
3.0
1579965 
2.0
285329 

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters13756938
Distinct characters5
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row1.0
2nd row1.0
3rd row3.0
4th row3.0
5th row3.0

Common Values

ValueCountFrequency (%)
1.02720352
56.6%
3.01579965
32.9%
2.0285329
 
5.9%
(Missing)222350
 
4.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
1.02720352
59.3%
3.01579965
34.5%
2.0285329
 
6.2%

Most occurring characters

ValueCountFrequency (%)
.4585646
33.3%
04585646
33.3%
12720352
19.8%
31579965
 
11.5%
2285329
 
2.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number9171292
66.7%
Other Punctuation4585646
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
04585646
50.0%
12720352
29.7%
31579965
 
17.2%
2285329
 
3.1%
Other Punctuation
ValueCountFrequency (%)
.4585646
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common13756938
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.4585646
33.3%
04585646
33.3%
12720352
19.8%
31579965
 
11.5%
2285329
 
2.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII13756938
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.4585646
33.3%
04585646
33.3%
12720352
19.8%
31579965
 
11.5%
2285329
 
2.1%

cod_familia_indigena_fam
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing2
Missing (%)< 0.1%
Memory size36.7 MiB
2.0
4782826 
1.0
 
25168

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters14423982
Distinct characters4
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row2.0
2nd row2.0
3rd row2.0
4th row2.0
5th row2.0

Common Values

ValueCountFrequency (%)
2.04782826
99.5%
1.025168
 
0.5%
(Missing)2
 
< 0.1%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
2.04782826
99.5%
1.025168
 
0.5%

Most occurring characters

ValueCountFrequency (%)
.4807994
33.3%
04807994
33.3%
24782826
33.2%
125168
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number9615988
66.7%
Other Punctuation4807994
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
04807994
50.0%
24782826
49.7%
125168
 
0.3%
Other Punctuation
ValueCountFrequency (%)
.4807994
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common14423982
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.4807994
33.3%
04807994
33.3%
24782826
33.2%
125168
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII14423982
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.4807994
33.3%
04807994
33.3%
24782826
33.2%
125168
 
0.2%

ind_familia_quilombola_fam
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing25170
Missing (%)0.5%
Memory size36.7 MiB
2.0
4753516 
1.0
 
29310

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters14348478
Distinct characters4
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row2.0
2nd row2.0
3rd row2.0
4th row2.0
5th row2.0

Common Values

ValueCountFrequency (%)
2.04753516
98.9%
1.029310
 
0.6%
(Missing)25170
 
0.5%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
2.04753516
99.4%
1.029310
 
0.6%

Most occurring characters

ValueCountFrequency (%)
.4782826
33.3%
04782826
33.3%
24753516
33.1%
129310
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number9565652
66.7%
Other Punctuation4782826
33.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
04782826
50.0%
24753516
49.7%
129310
 
0.3%
Other Punctuation
ValueCountFrequency (%)
.4782826
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common14348478
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
.4782826
33.3%
04782826
33.3%
24753516
33.1%
129310
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII14348478
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
.4782826
33.3%
04782826
33.3%
24753516
33.1%
129310
 
0.2%

nom_estab_assist_saude_fam
Categorical

HIGH CARDINALITY
MISSING

Distinct25584
Distinct (%)1.1%
Missing2441370
Missing (%)50.8%
Memory size36.7 MiB
CLINICA DA FAMILIA
 
19037
POSTO DE COLETA DE CODO I
 
4212
HOSPITAL MUNICIPAL JAMEL CECILIO ANAPOLIS
 
4014
UNIDADE DE SAUDE FAMILIAR COMUNITARIA
 
2451
HOSPITAL MUNICIPAL DE IPIRA
 
2044
Other values (25579)
2334868 

Length

Max length60
Median length44
Mean length29.54982874
Min length3

Characters and Unicode

Total characters69933393
Distinct characters37
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1795
Unique (%)0.1%

Sample

1st rowUS CAMPO VERDE
2nd rowUNIDADE REGIONAL DE SAUDE SERRA
3rd rowUNIDADE BASICA DE SAUDE VILA NOVA DE COLARES
4th rowUNIDADE DE SAUDE DA FAMILIA DE ULISSES GUIMARAES
5th rowUNIDADE DE SAUDE DA FAMILIA DE TERRA VERMELHA

Common Values

ValueCountFrequency (%)
CLINICA DA FAMILIA19037
 
0.4%
POSTO DE COLETA DE CODO I4212
 
0.1%
HOSPITAL MUNICIPAL JAMEL CECILIO ANAPOLIS4014
 
0.1%
UNIDADE DE SAUDE FAMILIAR COMUNITARIA2451
 
0.1%
HOSPITAL MUNICIPAL DE IPIRA2044
 
< 0.1%
UBS DE SANTALUZ2021
 
< 0.1%
C S F ARGEU HERBSTER1783
 
< 0.1%
UNIDADE MISTA DE AFUA1775
 
< 0.1%
SECRETARIA MUNICIPAL DA SAUDE DE IJUI VIGILANCIA EM SAUDE1773
 
< 0.1%
CENTRO DE SAUDE SAO FRANCISCO1753
 
< 0.1%
Other values (25574)2325763
48.4%
(Missing)2441370
50.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
de1418339
 
11.3%
saude978408
 
7.8%
unidade584627
 
4.7%
da483610
 
3.9%
ubs438532
 
3.5%
familia349953
 
2.8%
centro305587
 
2.4%
psf209944
 
1.7%
usf202520
 
1.6%
basica185989
 
1.5%
Other values (11980)7349855
58.8%

Most occurring characters

ValueCountFrequency (%)
(space)10140738
14.5%
A9127347
13.1%
E6243142
8.9%
D5530745
 
7.9%
I5234689
 
7.5%
S4842292
 
6.9%
O4121607
 
5.9%
U3498777
 
5.0%
R3367949
 
4.8%
N2997304
 
4.3%
Other values (27)14828803
21.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter59470859
85.0%
Space Separator10140738
 
14.5%
Decimal Number321796
 
0.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A9127347
15.3%
E6243142
10.5%
D5530745
9.3%
I5234689
8.8%
S4842292
 
8.1%
O4121607
 
6.9%
U3498777
 
5.9%
R3367949
 
5.7%
N2997304
 
5.0%
L2139306
 
3.6%
Other values (16)12367701
20.8%
Decimal Number
ValueCountFrequency (%)
160180
18.7%
257606
17.9%
351712
16.1%
050853
15.8%
530567
9.5%
423747
 
7.4%
712913
 
4.0%
612822
 
4.0%
911044
 
3.4%
810352
 
3.2%
Space Separator
ValueCountFrequency (%)
(space)10140738
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin59470859
85.0%
Common10462534
 
15.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A9127347
15.3%
E6243142
10.5%
D5530745
9.3%
I5234689
8.8%
S4842292
 
8.1%
O4121607
 
6.9%
U3498777
 
5.9%
R3367949
 
5.7%
N2997304
 
5.0%
L2139306
 
3.6%
Other values (16)12367701
20.8%
Common
ValueCountFrequency (%)
(space)10140738
96.9%
160180
 
0.6%
257606
 
0.6%
351712
 
0.5%
050853
 
0.5%
530567
 
0.3%
423747
 
0.2%
712913
 
0.1%
612822
 
0.1%
911044
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII69933393
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space)10140738
14.5%
A9127347
13.1%
E6243142
8.9%
D5530745
 
7.9%
I5234689
 
7.5%
S4842292
 
6.9%
O4121607
 
5.9%
U3498777
 
5.0%
R3367949
 
4.8%
N2997304
 
4.3%
Other values (27)14828803
21.2%

cod_eas_fam
Real number (ℝ≥0)

MISSING

Distinct27005
Distinct (%)1.1%
Missing2441370
Missing (%)50.8%
Infinite0
Infinite (%)0.0%
Mean3075005.845
Minimum19
Maximum9630546
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size36.7 MiB

Quantile statistics

Minimum19
5-th percentile24473
Q12290278
median2533979
Q33182738
95-th percentile6874053
Maximum9630546
Range9630527
Interquartile range (IQR)892460

Descriptive statistics

Standard deviation1707394.582
Coefficient of variation (CV)0.5552492151
Kurtosis1.603776376
Mean3075005.845
Median Absolute Deviation (MAD)297072
Skewness1.245833164
Sum7.277388783 × 10^12
Variance2.915196258 × 10^12
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value                  Count     Frequency (%)
6310354                19031     0.4%
3023435                4212      0.1%
2361744                4014      0.1%
2653966                2451      0.1%
4026640                2044      < 0.1%
2511088                2021      < 0.1%
2482347                1783      < 0.1%
2316048                1775      < 0.1%
6859534                1773      < 0.1%
2482088                1710      < 0.1%
Other values (26995)   2325812   48.4%
(Missing)              2441370   50.8%

Smallest values
Value                  Count     Frequency (%)
19                     33        < 0.1%
35                     10        < 0.1%
43                     386       < 0.1%
51                     400       < 0.1%
86                     18        < 0.1%
108                    127       < 0.1%
116                    120       < 0.1%
124                    214       < 0.1%
132                    181       < 0.1%
140                    156       < 0.1%

Largest values
Value                  Count     Frequency (%)
9630546                3         < 0.1%
9618694                2         < 0.1%
9614990                1         < 0.1%
9614745                1         < 0.1%
9598405                1         < 0.1%
9597603                1         < 0.1%
9590560                1         < 0.1%
9575065                2         < 0.1%
9573550                3         < 0.1%
9572511                1         < 0.1%

nom_centro_assist_fam
Categorical

HIGH CARDINALITY
MISSING

Distinct3699
Distinct (%)0.2%
Missing3031030
Missing (%)63.0%
Memory size36.7 MiB
CRAS CENTRO DE REFERENCIA DE ASSISTENCIA SOCIAL
 
39099
CRAS CENTRO
 
34910
CRAS CENTRO DE REFERENCIA DA ASSISTENCIA SOCIAL
 
26266
CRAS
 
16197
CENTRO DE REFERENCIA DE ASSISTENCIA SOCIAL
 
14565
Other values (3694)
1645929 

Length

Max length70
Median length62
Mean length22.31553277
Min length4

Characters and Unicode

Total characters39653943
Distinct characters38
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique186
Unique (%)< 0.1%

Sample

1st rowCRAS DE SERRA SEDE
2nd rowCRAS VIANA
3rd rowCRAS IV ALTO MUCURI
4th rowCRAS III CAMPO VERDE
5th rowCRAS DE VILA NOVA DE COLARES

Common Values

ValueCountFrequency (%)
CRAS CENTRO DE REFERENCIA DE ASSISTENCIA SOCIAL39099
 
0.8%
CRAS CENTRO34910
 
0.7%
CRAS CENTRO DE REFERENCIA DA ASSISTENCIA SOCIAL26266
 
0.5%
CRAS16197
 
0.3%
CENTRO DE REFERENCIA DE ASSISTENCIA SOCIAL14565
 
0.3%
CRAS I14545
 
0.3%
CRAS CENTRAL14418
 
0.3%
CENTRO DE REFERENCIA DA ASSISTENCIA SOCIAL13481
 
0.3%
CRAS CASA DA FAMILIA10952
 
0.2%
CRAS GRAJAU10554
 
0.2%
Other values (3689)1581979
32.9%
(Missing)3031030
63.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
cras1688249
25.3%
de459637
 
6.9%
centro269556
 
4.0%
social215685
 
3.2%
referencia214237
 
3.2%
assistencia210105
 
3.1%
da170611
 
2.6%
sao76547
 
1.1%
casa71732
 
1.1%
i69431
 
1.0%
Other values (3147)3224333
48.3%

Most occurring characters

ValueCountFrequency (%)
A6010432
15.2%
(space)4893157
12.3%
R4004524
10.1%
S3703775
9.3%
C3349080
8.4%
E3208257
8.1%
I2885801
7.3%
O2200027
 
5.5%
N1718131
 
4.3%
D1413487
 
3.6%
Other values (28)6267272
15.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter34691229
87.5%
Space Separator4893157
 
12.3%
Decimal Number69048
 
0.2%
Connector Punctuation509
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A6010432
17.3%
R4004524
11.5%
S3703775
10.7%
C3349080
9.7%
E3208257
9.2%
I2885801
8.3%
O2200027
 
6.3%
N1718131
 
5.0%
D1413487
 
4.1%
T1229859
 
3.5%
Other values (16)4967856
14.3%
Decimal Number
ValueCountFrequency (%)
020193
29.2%
117123
24.8%
29704
14.1%
37508
 
10.9%
47444
 
10.8%
71996
 
2.9%
81887
 
2.7%
61376
 
2.0%
51091
 
1.6%
9726
 
1.1%
Space Separator
ValueCountFrequency (%)
(space)4893157
100.0%
Connector Punctuation
ValueCountFrequency (%)
_509
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin34691229
87.5%
Common4962714
 
12.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
A6010432
17.3%
R4004524
11.5%
S3703775
10.7%
C3349080
9.7%
E3208257
9.2%
I2885801
8.3%
O2200027
 
6.3%
N1718131
 
5.0%
D1413487
 
4.1%
T1229859
 
3.5%
Other values (16)4967856
14.3%
Common
ValueCountFrequency (%)
(space)4893157
98.6%
020193
 
0.4%
117123
 
0.3%
29704
 
0.2%
37508
 
0.2%
47444
 
0.1%
71996
 
< 0.1%
81887
 
< 0.1%
61376
 
< 0.1%
51091
 
< 0.1%
Other values (2)1235
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII39653943
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A6010432
15.2%
(space)4893157
12.3%
R4004524
10.1%
S3703775
9.3%
C3349080
8.4%
E3208257
8.1%
I2885801
7.3%
O2200027
 
5.5%
N1718131
 
4.3%
D1413487
 
3.6%
Other values (28)6267272
15.8%

cod_centro_assist_fam
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct5639
Distinct (%)0.3%
Missing3031030
Missing (%)63.0%
Infinite0
Infinite (%)0.0%
Mean3.16930602 × 10^10
Minimum1.10001204 × 10^10
Maximum5.300109833 × 10^10
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size36.7 MiB

Quantile statistics

Minimum1.10001204 × 10^10
5-th percentile1.40010365 × 10^10
Q12.510800679 × 10^10
median3.304550065 × 10^10
Q33.550303288 × 10^10
95-th percentile5.006600139 × 10^10
Maximum5.300109833 × 10^10
Range4.200097794 × 10^10
Interquartile range (IQR)1.039502609 × 10^10

Descriptive statistics

Standard deviation9529194109
Coefficient of variation (CV)0.3006713158
Kurtosis-0.155210341
Mean3.16930602 × 10^10
Median Absolute Deviation (MAD)5981499146
Skewness-0.04535792694
Sum5.63174904 × 10^16
Variance9.080554037 × 10^19
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value                  Count     Frequency (%)
3.550303288 × 10^10    10554     0.2%
3.550300162 × 10^10    10447     0.2%
3.550300165 × 10^10    8915      0.2%
3.550303289 × 10^10    7656      0.2%
2.30370012 × 10^10     7220      0.2%
3.550300177 × 10^10    6999      0.1%
3.550300167 × 10^10    6998      0.1%
3.55030018 × 10^10     6291      0.1%
3.550300168 × 10^10    6144      0.1%
3.550300163 × 10^10    6000      0.1%
Other values (5629)    1699742   35.4%
(Missing)              3031030   63.0%

Smallest values
Value                  Count     Frequency (%)
1.10001204 × 10^10     357       < 0.1%
1.100020668 × 10^10    350       < 0.1%
1.100051504 × 10^10    56        < 0.1%
1.100061099 × 10^10    26        < 0.1%
1.100070427 × 10^10    98        < 0.1%
1.100081528 × 10^10    267       < 0.1%
1.100092022 × 10^10    12        < 0.1%
1.100111019 × 10^10    887       < 0.1%
1.100112041 × 10^10    617       < 0.1%
1.10012039 × 10^10     66        < 0.1%

Largest values
Value                  Count     Frequency (%)
5.300109833 × 10^10    27        < 0.1%
5.300109755 × 10^10    14        < 0.1%
5.300109754 × 10^10    4         < 0.1%
5.30010967 × 10^10     2         < 0.1%
5.30010373 × 10^10     406       < 0.1%
5.300103611 × 10^10    193       < 0.1%
5.300103566 × 10^10    1250      < 0.1%
5.300103514 × 10^10    332       < 0.1%
5.300103512 × 10^10    383       < 0.1%
5.300102047 × 10^10    359       < 0.1%

ind_parc_mds_fam
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct13
Distinct (%)< 0.1%
Missing155334
Missing (%)3.2%
Infinite0
Infinite (%)0.0%
Mean19.30041168
Minimum0
Maximum306
Zeros4244511
Zeros (%)88.3%
Negative0
Negative (%)0.0%
Memory size36.7 MiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile205
Maximum306
Range306
Interquartile range (IQR)0

Descriptive statistics

Standard deviation63.21474913
Coefficient of variation (CV)3.275305739
Kurtosis8.191045388
Mean19.30041168
Median Absolute Deviation (MAD)0
Skewness3.102432982
Sum89798292
Variance3996.104508
MonotonicityNot monotonic
Histogram with fixed size bins (bins=13)
Most frequent values
Value              Count     Frequency (%)
0                  4244511   88.3%
205                263094    5.5%
202                43216     0.9%
301                25218     0.5%
204                24160     0.5%
306                22809     0.5%
303                10342     0.2%
201                8732      0.2%
305                4535      0.1%
304                2462      0.1%
Other values (3)   3583      0.1%
(Missing)          155334    3.2%

Smallest values
Value              Count     Frequency (%)
0                  4244511   88.3%
101                1810      < 0.1%
201                8732      0.2%
202                43216     0.9%
203                1041      < 0.1%
204                24160     0.5%
205                263094    5.5%
301                25218     0.5%
302                732       < 0.1%
303                10342     0.2%

Largest values
Value              Count     Frequency (%)
306                22809     0.5%
305                4535      0.1%
304                2462      0.1%
303                10342     0.2%
302                732       < 0.1%
301                25218     0.5%
205                263094    5.5%
204                24160     0.5%
203                1041      < 0.1%
202                43216     0.9%

marc_pbf
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size36.7 MiB
1
2424434 
0
2383562 

Length

Max length1
Median length1
Mean length1
Min length1

Characters and Unicode

Total characters4807996
Distinct characters2
Distinct categories1
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st row0
2nd row1
3rd row0
4th row1
5th row1

Common Values

ValueCountFrequency (%)
12424434
50.4%
02383562
49.6%

Length

Histogram of lengths of the category

Category Frequency Plot

ValueCountFrequency (%)
12424434
50.4%
02383562
49.6%

Most occurring characters

ValueCountFrequency (%)
12424434
50.4%
02383562
49.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number4807996
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
12424434
50.4%
02383562
49.6%

Most occurring scripts

ValueCountFrequency (%)
Common4807996
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
12424434
50.4%
02383562
49.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII4807996
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
12424434
50.4%
02383562
49.6%

qtde_pessoas
Real number (ℝ≥0)

Distinct19
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2.687830855
Minimum1
Maximum31
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size36.7 MiB

Quantile statistics

Minimum1
5-th percentile1
Q12
median2
Q34
95-th percentile5
Maximum31
Range30
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.442161182
Coefficient of variation (CV)0.53655206
Kurtosis1.471999579
Mean2.687830855
Median Absolute Deviation (MAD)1
Skewness0.9751546479
Sum12923080
Variance2.079828875
MonotonicityNot monotonic
Histogram with fixed size bins (bins=19)
Most frequent values
Value              Count     Frequency (%)
2                  1300664   27.1%
3                  1155777   24.0%
1                  1122134   23.3%
4                  726300    15.1%
5                  309943    6.4%
6                  120289    2.5%
7                  44831     0.9%
8                  17188     0.4%
9                  6752      0.1%
10                 2600      0.1%
Other values (9)   1518      < 0.1%

Smallest values
Value              Count     Frequency (%)
1                  1122134   23.3%
2                  1300664   27.1%
3                  1155777   24.0%
4                  726300    15.1%
5                  309943    6.4%
6                  120289    2.5%
7                  44831     0.9%
8                  17188     0.4%
9                  6752      0.1%
10                 2600      0.1%

Largest values
Value              Count     Frequency (%)
31                 1         < 0.1%
18                 3         < 0.1%
17                 3         < 0.1%
16                 5         < 0.1%
15                 18        < 0.1%
14                 38        < 0.1%
13                 122       < 0.1%
12                 337       < 0.1%
11                 991       < 0.1%
10                 2600      0.1%

peso.fam
Real number (ℝ≥0)

Distinct886
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.21170859 × 10^14
Minimum5.501656235 × 10^12
Maximum5.504777045 × 10^14
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size36.7 MiB

Quantile statistics

Minimum5.501656235 × 10^12
5-th percentile5.502917229 × 10^13
Q15.502215892 × 10^14
median5.502451463 × 10^14
Q35.502540234 × 10^14
95-th percentile5.503448472 × 10^14
Maximum5.504777045 × 10^14
Range5.449760482 × 10^14
Interquartile range (IQR)3.243417885 × 10^10

Descriptive statistics

Standard deviation1.170883957 × 10^14
Coefficient of variation (CV)0.2246641263
Kurtosis12.33229392
Mean5.21170859 × 10^14
Median Absolute Deviation (MAD)1.243471773 × 10^10
Skewness-3.783180996
Sum-2.969788772 × 10^18
Variance1.370969241 × 10^28
MonotonicityNot monotonic
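
The negative Sum reported above is inconsistent with a column whose minimum is positive and which has no negative values. One plausible explanation, offered here as an assumption rather than something the report confirms, is that the values are stored as 64-bit integers and the running total wrapped around. The sketch below estimates the true sum from the reported mean and row count, then applies two's-complement wraparound; `wrap_int64` is a hypothetical helper written for this example.

```python
n_rows = 4_807_996        # "Number of observations"; the column has no missing values
mean = 5.21170859e14      # reported mean, rounded to the digits shown

# Estimated true sum, about 2.5e21, far above the int64 maximum of 2**63 - 1.
approx_sum = int(mean * n_rows)


def wrap_int64(x):
    """Map an arbitrary integer onto the signed 64-bit range (two's complement)."""
    return (x + 2**63) % 2**64 - 2**63


wrapped = wrap_int64(approx_sum)
print(f"estimated sum: {approx_sum:.6e}")
print(f"after int64 wraparound: {wrapped:.6e}")
```

The wrapped estimate lands close to the reported figure, so the Sum for this variable should not be trusted as printed. The estimate is only as precise as the rounded mean allows.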
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
5.502451463 × 10^14 | 839807 | 17.5%
5.502443092 × 10^14 | 233230 | 4.9%
5.50245607 × 10^14 | 121686 | 2.5%
5.50243164 × 10^14 | 98360 | 2.0%
5.502477921 × 10^14 | 73396 | 1.5%
5.502458232 × 10^14 | 66634 | 1.4%
5.502484072 × 10^13 | 60027 | 1.2%
5.502427962 × 10^14 | 50397 | 1.0%
5.502457847 × 10^14 | 49254 | 1.0%
5.502430102 × 10^14 | 45467 | 0.9%
Other values (876) | 3169738 | 65.9%

Value | Count | Frequency (%)
5.501656235 × 10^12 | 2427 | 0.1%
5.501682233 × 10^12 | 1464 | < 0.1%
5.5018315 × 10^12 | 1769 | < 0.1%
5.502526708 × 10^12 | 15369 | 0.3%
5.50304564 × 10^12 | 508 | < 0.1%
5.503155193 × 10^12 | 1941 | < 0.1%
5.503181124 × 10^12 | 3271 | 0.1%
5.50442385 × 10^12 | 1180 | < 0.1%
5.500455455 × 10^13 | 1312 | < 0.1%
5.500467336 × 10^13 | 1028 | < 0.1%

Value | Count | Frequency (%)
5.504777045 × 10^14 | 6 | < 0.1%
5.504449465 × 10^14 | 1337 | < 0.1%
5.504443495 × 10^14 | 1082 | < 0.1%
5.504410327 × 10^14 | 983 | < 0.1%
5.504395098 × 10^14 | 1186 | < 0.1%
5.504391327 × 10^14 | 1053 | < 0.1%
5.504389448 × 10^14 | 1292 | < 0.1%
5.50438383 × 10^14 | 1104 | < 0.1%
5.504381965 × 10^14 | 2465 | 0.1%
5.504380104 × 10^14 | 1177 | < 0.1%

Interactions

[225 interaction plots (Matplotlib v3.5.1 SVG figures) not reproduced.]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
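
As a minimal sketch of that calculation (not necessarily how the report computes it), the function below ranks both variables, giving ties their average rank, and then divides the covariance of the ranks by the product of their standard deviations. The function names and the toy data are illustrative assumptions:

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Covariance of the rank variables over the product of their std devs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry)) / n
    sx = sqrt(sum((a - mx) ** 2 for a in rx) / n)
    sy = sqrt(sum((b - my) ** 2 for b in ry) / n)
    return cov / (sx * sy)

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]   # nonlinear but strictly increasing
rho = spearman_rho(x, y)  # equals 1 up to floating-point rounding
```

Because only the ordering matters, a strictly increasing but nonlinear relationship such as y = x³ still yields ρ = 1.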

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the slope of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
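
A short sketch of this formula, with made-up data, that also illustrates the invariance under changes of location and positive scale. The function name `pearson_r` is illustrative:

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their std devs."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)   # undefined if either variable is constant

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]          # roughly linear in x

r = pearson_r(x, y)
# Shift and rescale each variable separately; r should not change.
r_shifted = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
```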

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
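
The pair-counting definition can be sketched directly. This is the simple form described in the paragraph, often called τ-a; many libraries apply a tie correction (τ-b) instead, so results on data with ties may differ. The function name and data are illustrative:

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; O(n^2) sketch."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables order the pair the same way
        elif sign < 0:
            discordant += 1      # the variables disagree on the order
        # sign == 0 is a tie in x or y and counts as neither
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)

# 6 pairs in total: 5 concordant, 1 discordant (the swapped middle values).
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
```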

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma (2013).
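
A sketch of the commonly cited form of that bias correction, computed from a contingency table of counts. Whether the report's own implementation matches it in every detail (for example, whether a continuity correction is applied to the chi-squared statistic) is not stated here, so treat this as an illustration under that assumption:

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k table of counts.

    Assumes every row and column total is non-zero.
    """
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic (no continuity correction).
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction terms: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

v_independent = cramers_v_corrected([[10, 10], [10, 10]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
```

The two toy tables show the endpoints of the scale: identical rows give 0, and a diagonal table gives 1.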

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
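
The nullity correlation mentioned for the heatmap can be sketched as the Pearson correlation between "is present" indicators of two columns. This is an assumption about the computation, not something the report spells out. The presence patterns below are those of three columns in the ten "First rows" of the Sample section; `nullity_correlation` is a hypothetical helper:

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    sa = sqrt(sum((p - ma) ** 2 for p in a))
    sb = sqrt(sum((q - mb) ** 2 for q in b))
    return cov / (sa * sb)   # undefined if a column is all present or all missing

def column(pattern):
    """Build a toy column: 'x' where a value is present, None where missing."""
    return ["x" if present else None for present in pattern]

# 1 = value present, 0 = NaN, for sample rows 0..9.
nom_estab = column([0, 0, 0, 1, 1, 1, 1, 0, 1, 0])   # nom_estab_assist_saude_fam
cod_eas = column([0, 0, 0, 1, 1, 1, 1, 0, 1, 0])     # cod_eas_fam
nom_centro = column([1, 1, 1, 1, 0, 1, 1, 1, 1, 0])  # nom_centro_assist_fam

same = nullity_correlation(nom_estab, cod_eas)        # missing together
unrelated = nullity_correlation(nom_estab, nom_centro)
```

In these ten rows the health-unit name and code are always missing together (correlation 1), while the health-unit name and the assistance-centre name show no relationship (correlation 0). Ten rows are far too few to say anything about the full dataset.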

Sample

First rows

(index) | cd_ibge | estrato | classf | id_familia | dat_cadastramento_fam | dat_alteracao_fam | vlr_renda_media_fam | dat_atualizacao_familia | cod_local_domic_fam | cod_especie_domic_fam | qtd_comodos_domic_fam | qtd_comodos_dormitorio_fam | cod_material_piso_fam | cod_material_domic_fam | cod_agua_canalizada_fam | cod_abaste_agua_domic_fam | cod_banheiro_domic_fam | cod_escoa_sanitario_domic_fam | cod_destino_lixo_domic_fam | cod_iluminacao_domic_fam | cod_calcamento_domic_fam | cod_familia_indigena_fam | ind_familia_quilombola_fam | nom_estab_assist_saude_fam | cod_eas_fam | nom_centro_assist_fam | cod_centro_assist_fam | ind_parc_mds_fam | marc_pbf | qtde_pessoas | peso.fam
0 | 3205002 | 2 | 2 | 1.0 | 2018-06-28 | 2018-10-02 | 244.0 | 2018-06-28 | 1.0 | 1.0 | 5.0 | 2.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 1.0 | 1.0 | 2.0 | 2.0 | NaN | NaN | CRAS DE SERRA SEDE | 3.205003e+10 | 0.0 | 0 | 5 | 550256458545518
1 | 3205101 | 2 | 2 | 3.0 | 2018-08-27 | 2018-11-29 | 60.0 | 2018-11-29 | 1.0 | 1.0 | 5.0 | 2.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | NaN | NaN | CRAS VIANA | 3.205103e+10 | 0.0 | 1 | 5 | 550355704647837
2 | 3201308 | 2 | 2 | 4.0 | 2018-02-23 | 2018-02-27 | 937.0 | 2018-02-23 | 1.0 | 1.0 | 4.0 | 1.0 | 2.0 | 2.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 3.0 | 2.0 | 2.0 | NaN | NaN | CRAS IV ALTO MUCURI | 3.201300e+10 | 0.0 | 0 | 1 | 550259704488172
3 | 3201308 | 2 | 2 | 6.0 | 2013-12-27 | 2018-10-01 | 44.0 | 2017-06-22 | 1.0 | 1.0 | 4.0 | 1.0 | 2.0 | 2.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 3.0 | 2.0 | 2.0 | US CAMPO VERDE | 2652994.0 | CRAS III CAMPO VERDE | 3.201300e+10 | 0.0 | 1 | 2 | 550259704488172
4 | 3205002 | 2 | 2 | 7.0 | 2018-03-26 | 2018-03-28 | 0.0 | 2018-03-26 | 1.0 | 1.0 | 4.0 | 1.0 | 5.0 | 1.0 | 2.0 | 4.0 | 1.0 | 5.0 | 3.0 | 1.0 | 3.0 | 2.0 | 2.0 | UNIDADE REGIONAL DE SAUDE SERRA | 2465795.0 | NaN | NaN | 0.0 | 1 | 2 | 550256458545518
5 | 3205002 | 2 | 2 | 8.0 | 2016-10-27 | 2018-10-01 | 176.0 | 2016-10-27 | 1.0 | 1.0 | 6.0 | 3.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | 2.0 | UNIDADE BASICA DE SAUDE VILA NOVA DE COLARES | 2522845.0 | CRAS DE VILA NOVA DE COLARES | 3.205000e+10 | 0.0 | 1 | 5 | 550256458545518
6 | 3205200 | 2 | 2 | 9.0 | 2015-06-16 | 2018-10-01 | 312.0 | 2018-03-20 | 1.0 | 1.0 | 5.0 | 2.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 3.0 | 1.0 | 1.0 | 3.0 | 2.0 | 2.0 | UNIDADE DE SAUDE DA FAMILIA DE ULISSES GUIMARAES | 3346501.0 | CRAS JABAETE | 3.205202e+10 | 0.0 | 0 | 3 | 550245146328323
7 | 3201308 | 2 | 2 | 10.0 | 2017-04-05 | 2018-10-01 | 954.0 | 2018-07-04 | 1.0 | 1.0 | 1.0 | 1.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 1.0 | 2.0 | 2.0 | NaN | NaN | CRAS VII SOTELANDIA | 3.201304e+10 | 0.0 | 0 | 1 | 550259704488172
8 | 3205200 | 2 | 2 | 11.0 | 2018-10-03 | 2018-10-15 | 477.0 | 2018-10-03 | 1.0 | 1.0 | 5.0 | 2.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | UNIDADE DE SAUDE DA FAMILIA DE TERRA VERMELHA | 2403412.0 | CRAS MORADA DA BARRA | 3.205200e+10 | 0.0 | 0 | 2 | 550245146328323
9 | 3205002 | 2 | 2 | 12.0 | 2016-05-11 | 2016-05-11 | 4.0 | 2016-05-11 | 1.0 | 1.0 | 1.0 | 1.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | NaN | NaN | NaN | NaN | 0.0 | 1 | 3 | 550256458545518

Last rows

(index) | cd_ibge | estrato | classf | id_familia | dat_cadastramento_fam | dat_alteracao_fam | vlr_renda_media_fam | dat_atualizacao_familia | cod_local_domic_fam | cod_especie_domic_fam | qtd_comodos_domic_fam | qtd_comodos_dormitorio_fam | cod_material_piso_fam | cod_material_domic_fam | cod_agua_canalizada_fam | cod_abaste_agua_domic_fam | cod_banheiro_domic_fam | cod_escoa_sanitario_domic_fam | cod_destino_lixo_domic_fam | cod_iluminacao_domic_fam | cod_calcamento_domic_fam | cod_familia_indigena_fam | ind_familia_quilombola_fam | nom_estab_assist_saude_fam | cod_eas_fam | nom_centro_assist_fam | cod_centro_assist_fam | ind_parc_mds_fam | marc_pbf | qtde_pessoas | peso.fam
4807986 | 3550308 | 2 | 1 | 5290692.0 | 2018-11-22 | 2018-11-22 | 1400.0 | 2018-11-22 | 1.0 | 1.0 | 2.0 | 1.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | NaN | NaN | CRAS JACANA | 3.550300e+10 | 0.0 | 0 | 1 | 550244309203512
4807987 | 3550308 | 2 | 1 | 5290693.0 | 2018-01-10 | 2018-10-01 | 468.0 | 2018-01-10 | 1.0 | 1.0 | 4.0 | 1.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | UBS J VISTA ALEGRE | 2787946.0 | CRAS BRASILANDIA II | 3.550304e+10 | 0.0 | 0 | 2 | 550244309203512
4807988 | 3550308 | 2 | 1 | 5290694.0 | 2015-02-12 | 2018-10-15 | 30.0 | 2018-10-15 | 1.0 | 1.0 | 5.0 | 2.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | NaN | NaN | CRAS ITAIM PAULISTA II | 3.550304e+10 | 0.0 | 1 | 5 | 550244309203512
4807989 | 3550308 | 2 | 1 | 5290695.0 | 2014-12-03 | 2018-10-01 | 436.0 | 2017-11-21 | 1.0 | 1.0 | 3.0 | 1.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 1.0 | 2.0 | 2.0 | NaN | NaN | CRAS ARTHUR ALVIM | 3.550303e+10 | 0.0 | 0 | 3 | 550244309203512
4807990 | 3550308 | 2 | 1 | 5290696.0 | 2017-09-13 | 2018-10-24 | 217.0 | 2018-10-24 | 1.0 | 1.0 | 3.0 | 1.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | AMA UBS INTEGRADA JARDIM HELENA | 4049934.0 | CRAS SAO MIGUEL | 3.550300e+10 | 0.0 | 0 | 3 | 550244309203512
4807991 | 3550308 | 2 | 1 | 5290697.0 | 2018-07-30 | 2018-10-02 | 1129.0 | 2018-07-30 | 1.0 | 1.0 | 2.0 | 1.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | NaN | NaN | CRAS CIDADE LIDER | 3.550303e+10 | 0.0 | 0 | 1 | 550244309203512
4807992 | 3550308 | 2 | 1 | 5290698.0 | 2018-02-16 | 2018-10-01 | 0.0 | 2018-02-16 | 1.0 | 1.0 | 8.0 | 5.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | UBS INTEGRAL JARDIM MIRIAM II | 7128940.0 | CRAS CIDADE ADEMAR II | 3.550304e+10 | 0.0 | 1 | 1 | 550244309203512
4807993 | 3550308 | 2 | 1 | 5290699.0 | 2014-10-09 | 2018-10-01 | 162.0 | 2017-10-04 | 1.0 | 1.0 | 3.0 | 1.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | NaN | NaN | CRAS JABAQUARA | 3.550300e+10 | 0.0 | 0 | 4 | 550244309203512
4807994 | 3550308 | 2 | 1 | 5290700.0 | 2006-05-24 | 2018-09-30 | 83.0 | 2017-08-15 | 1.0 | 1.0 | 3.0 | 1.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 3.0 | 1.0 | 2.0 | 2.0 | UBS J NAKAMURA | 2787644.0 | CRAS M BOI MIRIM | 3.550300e+10 | 0.0 | 1 | 1 | 550244309203512
4807995 | 3550308 | 2 | 1 | 5290701.0 | 2015-05-14 | 2018-10-01 | 445.0 | 2017-08-02 | 1.0 | 1.0 | 4.0 | 2.0 | 5.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 2.0 | 2.0 | UBS VARGINHA | 2789299.0 | CRAS GRAJAU | 3.550303e+10 | 0.0 | 0 | 3 | 550244309203512